Click Here!
home account info subscribe login search My ITKnowledge FAQ/help site map contact us


 
Brief Full
 Advanced
      Search
 Search Tips
To access the contents, click the chapter and section titles.

Oracle Performance Tuning and Optimization
(Publisher: Macmillan Computer Publishing)
Author(s): Edward Whalen
ISBN: 067230886x
Publication Date: 04/01/96

Bookmark It

Search this book:
 
Previous Table of Contents Next


System Load

Like the DSS system, the CPUs in a data warehousing system are usually 100 percent active during the large business queries. Where OLTP systems have many users with small queries, the data warehousing system has relatively few users and massive queries (as does a DSS system). These queries should be able to take advantage of the capabilities of the CPUs and memory as long as the system does not become disk bound. By tuning the server using some of the concepts discussed in Chapters 9 and 10, you can avoid becoming disk bound.

The typical OLTP, batch, or DSS system may have an insufficient number of disk drives, which causes an I/O bottleneck. The data warehousing system may not have this problem because so much disk space is needed for the hundreds of gigabytes of historical and current business data, that, with careful planning, there are plenty of disk spindles to distribute the I/O load.

Following is a list of some of the load characteristics of a data warehousing system:

  Relatively few processes on the system. If you take advantage of the Parallel Query option, you add more processes and subsequently more process switches.
  Minimal network traffic. Network traffic is low during the transaction processing phases but may be significant during the data loading and updating phases.
  Heavy I/O usage. The decision support queries associated with the data warehouse usually generate large amounts of I/O to the data files. This I/O is somewhat random if multiple decision support queries are active simultaneously; but the I/Os are larger in size because of multiblock reads.
  Moderate to high redo log activity. Unlike the DSS system (where the redo log activity is very low), the redo log activity for a data warehouse may be moderate or even high. This is caused, not by the activity of the business transactions, but by the procedures necessary to prepare and load the data. The metadata may be constantly put together from many external sources.
  Moderate to heavy use of rollback segments. During the decision support queries, rollback segments will not be used heavily, but during the data-creation or conversion phase, rollback segment activity can be significant.
  Large amounts of memory. The memory is used not only for the SGA but for each of the server processes required for sort and join operations.

Defining and understanding these characteristics can help you design and tune your data warehouse for optimal performance. The first step in this design process is to set goals for what you want to achieve.

Goals

The goal in tuning the data warehouse is to achieve a system that has certain characteristics. Here is a list of the characteristics of an optimally tuned data warehouse:

  The system is CPU bound during decision support queries. By removing all other bottlenecks, the system should be able to process as fast as possible, which is the speed of the CPUs.
  The system is not drive bound. Any disk bottleneck degrades performance. If this is the case, you should add more or faster disks.
  Memory is sufficient. If the machine pages or swaps, performance is severely degraded. The best solution is to add more memory; if that is not possible, reduce the size of the SGA or the number of users until the system no longer pages or swaps.
  The system meets any additional requirements you might have. With some data warehousing machines, you must keep current with the OLTP systems by updating on a nightly basis. With the data warehouse, you may have to update the data within a certain time frame.

By setting goals for how you expect the system to perform, you can determine whether you are successful. You can also determine earlier whether you will be able to achieve your specified goals.

Review of Data Warehouse Characteristics

By analyzing how the system operates, how the queries work, and how the data is accessed, you are well on your way to determining the best configuration for your data warehouse. This information, together with the tuning procedures described in Part II of this book, can help you design an optimal configuration.

Data access in a data warehousing system is generally random, with some sequential components. By configuring the system and taking advantage of multiblock reads, you can achieve a high level of I/O performance.

Relate the information in the next part of this chapter to your particular configuration. Look for similarities and differences between what is described here and what you have observed about your system and decide how you can benefit from the tuning guidelines presented here. Although each system is different, many of the concepts remain the same.

You should base your system design on what you know about how your system operates and accesses data. Spend the time up front to carefully examine how the system needs to run so that you can determine the design of the system. The quality of the end product depends heavily on the amount of effort you put in at the beginning stages. No amount of tuning after the system is in production can make up for poor design choices.

Design Considerations

Looking at data access patterns can give you a good idea how to design the system. Before looking at the design process, consider these important issues, introduced earlier in this book:

  I/O is typically the limiting factor in the system. You can do only a fixed number of random I/Os per second per disk drive (refer to Chapter 14, “Advanced Disk I/O Concepts”).
  I/Os can be reduced by caching data blocks in the SGA. If the data you want to access is already in the SGA, a disk I/O is not required.
  Isolate sequential I/Os. Most of the time spent reading from or writing to the disk is spent seeking to where the data is located. If you can reduce seeks, you can achieve more I/Os per second.
  Spread out random I/Os. Random I/Os have a maximum rate per drive. By spreading the I/Os out among many drives, you increase the overall rate.
  Avoid paging and swapping. Any time the system pages or swaps, performance is severely degraded. Avoid this at all costs.

All these factors contribute to the optimal data layout of the system. The physical layout—along with SGA and shared pool tuning—creates an optimally configured server for the decision support tasks usually performed in the data warehouse. In data warehousing systems, the design of the queries is also very important, as you will see in later chapters.


Previous Table of Contents Next


Products |  Contact Us |  About Us |  Privacy  |  Ad Info  |  Home

Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.